Collected in river Umeälven (and tributary Vindelälven) and Dalälven
Total fecundity available (stripped + dissected), i.e. all eggs are counted!
Also contains trout data, which are filtered out.
Sweden 2 - Swedish west coast
Collected in rearing station in river Göta älv
All fin-clipped individuals.
Only stripped eggs (as far as V.T. knows; a methods description is currently lacking)
Not total fecundity, unlike Sweden 1
Finland 1
Described in the report “HYDROACOUSTIC ASSESSMENT OF SALMON IN THE RIVER TORNIONJOKI - FINAL REPORT, EU STUDY PROJECT 96-069”
Total fecundity available (stripped + dissected), i.e. an estimate of all eggs!
From river Tornionjoki
Both reared (fin-clipped) and wild individuals (adipose fin intact), but the majority are wild.
Finland 2
From river Tornionjoki and Simojoki
A mix of wild, reared and unknown-origin (NA) individuals
France
Described in the Samarch report “Changes in sex ratio and fecundity of salmonids” (M. Nevoux et al. 2020, Deliverable D3.3.1)
Fecundity was assessed by stripping: the eggs in a subset of the stripped volume (or weight) were counted, and total fecundity was then extrapolated from the total volume (or weight) of the stripped eggs (pers. comm. M. Nevoux)
From 13 rivers in three regions
Origin should mainly be wild (pers. comm. M. Nevoux)
The data is filtered in Section 4 when combining data.
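The subsample extrapolation described for the French data can be sketched as below. This is only an illustration of the arithmetic, not the SAMARCH procedure itself; the function name `extrapolate_fecundity` and the example numbers are hypothetical.

```r
# Hypothetical helper: scale the egg count in a subsample up to the whole
# stripped batch, using the ratio of total to subsample weight (or volume).
extrapolate_fecundity <- function(n_sub, sub_amount, total_amount) {
  stopifnot(all(sub_amount > 0), all(total_amount >= sub_amount))
  n_sub * total_amount / sub_amount
}

# Made-up example: 250 eggs counted in 30 g, 900 g of eggs stripped in total
est <- extrapolate_fecundity(n_sub = 250, sub_amount = 30, total_amount = 900)
print(est)
```

Note that an estimate of this kind covers stripped eggs only, so it is not directly comparable with the total fecundity (stripped + dissected) of Sweden 1 and Finland 1.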
2. Spatial aggregations - Baltic Assessment and Atlantic Stock units and Regions
To add spatial information, the sai_location name in the data is matched (using dplyr::left_join()) with the sai_location name in the tables created below (SweFin.rivers and fra.rivers2).
The spatial information added is:
Coordinates (WGS84 decimal degrees). French coordinates were either provided by Hilaire Drouineau or added manually by V.T.; they are at the river mouth (entering the sea or, for a tributary, entering the main river) or, for the Baie du Mont Saint-Michel, in the centre of the bay. These are not catch locations. For many tributaries the main river mouth is used, and for those rivers the names are changed in 3a when combining data. Swedish coordinates either already existed in the data or were added manually (river mouth, not catch location).
Stock unit (for the WGNAS stock units Sweden, Ireland and France),
Assessment unit (for WGBAST),
Region and genetic region. French regions are from Perrier et al. 2011 (doi: 10.1111/j.1365-294X.2011.05266.x), provided by M. Nevoux; they either exist in the French fecundity data or were added manually by V.T. with input from M. Nevoux. Swedish regions are defined as Baltic Sea and Swedish west coast, and the Finnish region is Baltic Sea.
All Irish data share one coordinate which represents where the Burrishoole enters the sea.
# Assessment units of the index rivers in Sweden and Finland.
AU.rivers = bind_cols(
  sai_location = c("Tornionjoki", "Simojoki", "Kalixälven", "Råneälven", "Piteälven", "Åbyälven", "Byskeälven",
                   "Rickleån", "Sävarån", "Vindelälven", "Öreälven", "Lögdeälven", "Ljungan", "Mörrumsån", "Emån",
                   "Kågeälven", "Testeboån", "Umeälven", "Dalälven", "Luleälven", "Muonionjoki"),
  asses.unit = c(1,1,1,1,2,2,2,2,2,2,2,2,3,4,4,2,3,2,3,2,1),
  stock.origin = "wild") %>%
  bind_rows(bind_cols(
    sai_location = c("Torneälven_hatchery", "Luleälven_(RG_with_Pite)", "Iijoki", "Oulujoki", "Skellefteälven",
                     "Umeälven_(RG_with_Vindel)", "Ångermanälven", "Indalsälven_(RG_with_Ljungan)", "Ljusnan",
                     "Dalälven_(RG_with_Testeboån)", "Torneälven", "Gideälven"),
    asses.unit = c(1,2,1,1,2,2,3,3,3,3,1,2),
    stock.origin = "reared"))

# And the Swedish rivers entering the Western sea, i.e. WGNAS stock unit "Sweden".
SU.rivers = bind_cols(
  sai_location = c("Ätran", "Örekilsälven", "Göta älv", "Lagan", "Västerhavet (hela) ICES SD 20-21",
                   "Genevadsån", "Fylleån", "Stensån"),
  stock.unit = "Sweden",
  stock.origin = NA)

# Add AU and SU to lat lons from Swedish Sötebasen
SweFin.rivers <- swe.sallaa %>%
  drop_na(length_mm) %>%
  distinct(sai_location, WGS84_N_Vatten, WGS84_E_Vatten) %>%
  rename(fisa_y_4326 = WGS84_N_Vatten,
         fisa_x_4326 = WGS84_E_Vatten) %>%
  mutate(fisa_y_4326 = case_when(sai_location == "Östersjön (hela) ICES SD 22-32" ~ 58.475309,
                                 .default = fisa_y_4326),
         fisa_x_4326 = case_when(sai_location == "Östersjön (hela) ICES SD 22-32" ~ 19.780140,
                                 .default = fisa_x_4326)) %>%
  bind_rows(data.frame(sai_location = c("Tornionjoki", "Simojoki", "Muonionjoki")) %>%
              mutate(fisa_y_4326 = case_when(sai_location %in% c("Tornionjoki", "Muonionjoki") ~ 65.879905,
                                             sai_location == "Simojoki" ~ 65.625639,
                                             sai_location == "Östersjön (hela) ICES SD 22-32" ~ 58.475309,
                                             sai_location == "Gideälven" ~ 63.327482,
                                             .default = NA),
                     # case_when() uses the first matching condition, so "Simojoki" must not be
                     # listed with Tornionjoki here or its own longitude below is never reached
                     fisa_x_4326 = case_when(sai_location %in% c("Tornionjoki", "Muonionjoki") ~ 24.136424,
                                             sai_location == "Simojoki" ~ 25.052169,
                                             sai_location == "Östersjön (hela) ICES SD 22-32" ~ 19.780140,
                                             sai_location == "Gideälven" ~ 19.140244,
                                             .default = NA))) %>%
  left_join(AU.rivers) %>%
  left_join(SU.rivers) %>%
  mutate(region = if_else(stock.unit == "Sweden", "Swedish.westcoast", "Baltic.sea", missing = "Baltic.sea"))

# French rivers by region and sub-regions from Marie Nevoux
fra.rivers <- read_csv(file = "/Users/vitl0001/Documents/Projects/DIASPARA/riviere_region_France.csv") %>%
  mutate(sai_location = str_to_title(river),
         stock.unit = "France")

# French sai_location abbreviations for regional genotypic aggregations from Perrier et al. 2011
fra.rivabb <- read_delim(file = "/Users/vitl0001/Documents/Projects/DIASPARA/french_genotypes_Perrier.txt", delim = "\t") %>%
  rename(sai_location = River) %>%
  mutate(sai_location.abb = str_to_upper(str_sub(sai_location, start = 1, end = 3)),
         sai_location.abb = case_when(sai_location == "NIVE" ~ "NIE",
                                      sai_location == "NIVELLE" ~ "NIL",
                                      .default = sai_location.abb),
         sai_location = str_to_title(sai_location)) %>%
  mutate(region.gen = case_when(
    sai_location.abb %in% c("COU","TRI","DOU","LEG","STE","AUL","GOY","ELO","ELL","PEN","ODE","AVE","JET","SCO","BLA") ~ "Brittany",
    sai_location.abb %in% c("ORN","VIR","SEI","SAI","SIE","SEL","SEE") ~ "Lower-Normandy",
    sai_location.abb %in% c("NIL","NIE","GAV") ~ "Adour",
    sai_location.abb %in% c("GAR","DOR","ALL") ~ "Allier-Gironde",
    sai_location.abb %in% c("TOU","VAL","AUT","CAN","BRE","ARQ") ~ "Upper-Normandy",
    .default = NA))

# French sai_location coordinates from Hilaire Drouineau
fra.rivers.sf <- read_sf("/Users/vitl0001/Documents/Projects/DIASPARA/salmon_frarivers", stringsAsFactors = FALSE)

# Combine all French sai_location info
fra.rivers2 <- fra.rivers %>%
  full_join(fra.rivabb) %>%
  bind_rows(
    # adding missing rivers from fra.sallaa
    tibble(sai_location = c("Isole", "Etel", "Quillec", "Horn", "St Laurent"), region.gen = "Brittany", stock.unit = "France"),
    tibble(sai_location = c("Baie Du Mont Saint Michel", "Thar"), region.gen = "Lower-Normandy", stock.unit = "France"),
    tibble(sai_location = c("Loire"), region.gen = "Allier-Gironde", stock.unit = "France"),
    tibble(sai_location = c("Durdent"), region.gen = "Upper-Normandy", stock.unit = "France")) %>%
  left_join(st_coordinates(fra.rivers.sf) %>%
              as.data.frame() %>%
              rename(fisa_x_4326 = X,
                     fisa_y_4326 = Y) %>%
              bind_cols(st_set_geometry(fra.rivers.sf, NULL)) %>%
              rename(sai_location = "river")) %>%
  # assign region.gen to those missing
  mutate(region.gen = case_when(sai_location == "Oir" ~ "Lower-Normandy",
                                subregion == "Adour" & is.na(sai_location.abb) ~ "Adour",
                                subregion == "Gironde" & is.na(sai_location.abb) ~ "Allier-Gironde",
                                region == "Bretagne" & is.na(sai_location.abb) ~ "Brittany",
                                .default = region.gen)) %>%
  # remove the "+ Affl" entries (tributaries), "Gave Mauleon (Le Saison)" (same as Gave Mauleon),
  # the duplicate "Gave D'oloron", and "See Selune" (See and Selune exist individually)
  filter(!sai_location %in% c("See Selune", "Odet + Affl", "Elle + Affl", "Gave Mauleon (Le Saison)", "Gave D'oloron")) %>%
  # add lats (fisa_y_4326) and lons (fisa_x_4326) where missing
  mutate(fisa_y_4326 = case_when(sai_location == "Baie Du Mont Saint Michel" ~ 48.655943,
                                 sai_location == "Valmont" ~ 49.761966,
                                 sai_location == "Seine" ~ 49.435474,
                                 sai_location == "Isole" ~ 47.874431,
                                 sai_location == "Loire" ~ 47.281585,
                                 sai_location == "Finistere" ~ 48.306467,
                                 sai_location == "Etel" ~ 47.656579,
                                 sai_location == "Quillec" ~ 48.685033,
                                 sai_location == "Thar" ~ 48.800103,
                                 sai_location == "Horn" ~ 48.688119,
                                 sai_location == "Durdent" ~ 48.687806,
                                 sai_location == "Canche" ~ 50.527333,
                                 sai_location == "Couesnon" ~ 48.625250,
                                 sai_location == "St Laurent" ~ 47.903795,
                                 .default = fisa_y_4326),
         fisa_x_4326 = case_when(sai_location == "Baie Du Mont Saint Michel" ~ -1.656370,
                                 sai_location == "Valmont" ~ 0.377126,
                                 sai_location == "Seine" ~ 0.285060,
                                 sai_location == "Isole" ~ -3.546855,
                                 sai_location == "Loire" ~ -2.152414,
                                 sai_location == "Finistere" ~ -4.080223,
                                 sai_location == "Etel" ~ -3.209520,
                                 sai_location == "Quillec" ~ -4.069429,
                                 sai_location == "Thar" ~ -1.568264,
                                 sai_location == "Horn" ~ -4.058391,
                                 sai_location == "Durdent" ~ 0.608712,
                                 sai_location == "Canche" ~ 1.614964,
                                 sai_location == "Couesnon" ~ -1.511461,
                                 sai_location == "St Laurent" ~ -3.945979,
                                 .default = fisa_x_4326)) %>%
  # remove non-necessary info
  select(-subregion, -region, -id, -sai_location.abb)
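Because the spatial tables are joined by name, a misspelled or missing sai_location silently ends up with NA coordinates and units. A minimal base R sketch of a check for unmatched names follows; the two small data frames are made up for illustration and are not the project data.

```r
# Toy stand-ins for the biological data and a river lookup table
fish <- data.frame(
  sai_location = c("Tornionjoki", "Simojoki", "Lagan", "Simo joki"),
  length_mm    = c(820, 760, 690, 805),
  stringsAsFactors = FALSE
)
rivers <- data.frame(
  sai_location = c("Tornionjoki", "Simojoki", "Lagan"),
  asses.unit   = c(1, 1, NA),
  stringsAsFactors = FALSE
)

# Names in the data that have no partner in the lookup table
unmatched <- setdiff(unique(fish$sai_location), rivers$sai_location)
print(unmatched)

# Base R equivalent of a left join: every fish row is kept
joined <- merge(fish, rivers, by = "sai_location", all.x = TRUE)
print(joined)
```

Running such a check before the join makes it easier to tell a genuinely missing assessment unit (such as a west coast river) from a name that simply failed to match.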
3. Length at age
3a. Filter, clean and combine data
Remove NA values in length AND age.
Remove individuals from Swedish lakes
Correct an obvious outlier in the Swedish length data
Remove three mark-recaptured individuals in Swedish data
Remove an obvious outlier in the French data
Correct an obvious outlier in the French data
Calculate total length (\(L_t\)) from fork length (\(L_f\)) where needed in the French and Latvian data, based on the model \(L_t = \exp(0.2892351) \cdot L_f^{0.9623479}\) (fitted to French salmon for which both \(L_t\) and \(L_f\) are available, see below).
The French data have two sources of sex identification (observed in the field vs. genetic). I use genetic sex where available and complete these data with field observations. M. Nevoux considers field observations reliable only from (mid) August onwards. Among the individuals for which both the genetic and the field method are available, about 19% of the field determinations are incorrect (see the check below). Keep this in mind if using this information. The sex in the Finnish (and likely in the Swedish) data is determined visually when gutting the fish, and far from all individuals are sexed (many NAs).
The variables kept are: length, sai_location, sai_cou_code, origin, fi_year, age_ad, sex. Spatial variables are added to the data from the tables created in Section 2.
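Two of the steps listed above, the fork-to-total length conversion and the completion of genetic sex with field observations, can be sketched in base R as follows. The coefficients are the ones quoted in the list; the function name `fork_to_total`, the column names and the toy records are assumptions made for this illustration.

```r
# Fork-to-total length conversion, Lt = exp(log_a) * Lf^b,
# with the coefficients quoted in the text as defaults
fork_to_total <- function(fork_length, log_a = 0.2892351, b = 0.9623479) {
  exp(log_a) * fork_length^b
}

# Made-up records: total length is missing for two fish, genetic sex for one
fish <- data.frame(
  total_length = c(NA, 640, NA),
  fork_length  = c(500, 610, 720),
  gen_sex      = c("F", NA, "M"),
  field_sex    = c("M", "f", NA),
  stringsAsFactors = FALSE
)

# Use the measured total length where present, otherwise convert fork length
fish$length_mm <- ifelse(is.na(fish$total_length),
                         fork_to_total(fish$fork_length),
                         fish$total_length)

# Prefer genetic sex and fall back on the field observation
fish$sex <- tolower(ifelse(is.na(fish$gen_sex), fish$field_sex, fish$gen_sex))
print(fish)
```

In the first toy record the field observation disagrees with the genetic sex and is overridden, which is the situation the error rate mentioned above refers to.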
# About 19% of the sex determinations in the field are wrong (see the output below).
fra.sallaa %>%
  drop_na(`Genetic sex`, `Sex observed in the field`) %>%
  rename(gs = `Genetic sex`,
         fs = `Sex observed in the field`) %>%
  filter(gs != fs) %>%
  summarise(perc.incorr = 100 * n() / nrow(fra.sallaa %>% drop_na(`Genetic sex`, `Sex observed in the field`)))
drop_na: removed 68,383 rows (95%), 3,738 rows remaining
rename: renamed 2 variables (fs, gs)
filter: removed 3,031 rows (81%), 707 rows remaining
drop_na: removed 68,383 rows (95%), 3,738 rows remaining
summarise: now one row and one column, ungrouped
# A tibble: 1 × 1
perc.incorr
<dbl>
1 18.9
# Model to convert fork length in the French data to total length. V.T. models this relationship using a log-linear model (Lt = a*Lf^b) to estimate a and b. Lt ~ Lf is almost linear but a log(Lt) ~ log(Lf) relationship makes for a seemingly better fit.
# The two estimates returned are log(a) (intercept) and b (slope).
fra.sallaa %>%
  filter(total_length > 100) %>%
  drop_na(total_length, fork_length) %>%
  lm(log(total_length) ~ log(fork_length), data = .) %>%
  tidy() %>%
  pull(estimate)
# looks good:
fra.sallaa %>%
  filter(total_length > 100) %>% # removes an obvious outlier where Lf >> Lt
  drop_na(total_length, fork_length) %>%
  ggplot(aes(fork_length, total_length)) +
  geom_point() +
  geom_line(aes(x = fork_length, y = exp(0.2892351) * fork_length^0.9623479), col = "red") +
  labs(title = "fork to total length fit length at age")
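The log-log fit used for the length conversion can be reproduced on simulated data to see how the coefficients are recovered. This is a base R sketch with made-up parameter values (`a_true`, `b_true`) and simulated lengths, not the French data.

```r
set.seed(1)

# Simulated fork lengths (mm) and total lengths following Lt = a * Lf^b
a_true <- 1.3
b_true <- 0.96
fork_length  <- runif(200, min = 150, max = 1000)
total_length <- a_true * fork_length^b_true * exp(rnorm(200, sd = 0.01))

# Linear regression on the log scale: log(Lt) = log(a) + b * log(Lf)
fit <- lm(log(total_length) ~ log(fork_length))

a_hat <- exp(unname(coef(fit)[1]))  # back-transform the intercept to a
b_hat <- unname(coef(fit)[2])       # the slope is the exponent b
print(c(a = a_hat, b = b_hat))
```

The intercept is estimated on the log scale, which is why the conversion formula in the text applies exp() to the first coefficient.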
# There are three mark-recaptured individuals in the Swedish database Sötebasen once NA lengths and ages are removed. As they are so few: identify them, remove the first age-length measurement and keep the later one (when they are bigger). They are all recaptured within the same year.
dup.markrec <- swe.sallaa %>%
  # remove NAs
  drop_na(length_mm, sea_age_year, MärkeNr) %>%
  # find the mark-recaptured ones by counting rows by tag number
  mutate(n = n(), .by = MärkeNr) %>%
  # keep the mark-recaptures and drop entries that are not tag numbers ("Fenklippt" = fin-clipped)
  filter(n > 1 & !MärkeNr == "Fenklippt") %>%
  # identify the marking (shorter) instance
  slice_min(length_mm, by = MärkeNr)
drop_na: removed 105,549 rows (>99%), 300 rows remaining
mutate: new variable 'n' (integer) with 3 unique values and 0% NA
filter: removed 294 rows (98%), 6 rows remaining
slice_min: removed 3 rows (50%), 3 rows remaining
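The same logic (count records per tag, take the shorter record of each repeated tag as the marking event, then drop it) can be sketched in base R on a toy table. The column names `tag_id` and `length_mm` and all values are made up for this illustration.

```r
# Made-up tagging records; tag "A12" was measured twice in the same year
tags <- data.frame(
  tag_id    = c("A12", "B07", "A12", "Fenklippt", "Fenklippt"),
  length_mm = c(702, 655, 731, 600, 640),
  stringsAsFactors = FALSE
)

# Number of records per tag
tags$n <- ave(seq_along(tags$tag_id), tags$tag_id, FUN = length)

# Candidate recaptures: repeated entries that are real tag numbers
recap <- tags[tags$n > 1 & tags$tag_id != "Fenklippt", ]

# The shorter record of each tag is taken as the marking event
is_first    <- recap$length_mm == ave(recap$length_mm, recap$tag_id, FUN = min)
first_event <- recap[is_first, ]

# Drop the marking events, keeping the later (longer) measurement
key     <- function(d) paste(d$tag_id, d$length_mm)
cleaned <- tags[!key(tags) %in% key(first_event), ]
print(cleaned)
```

Entries such as "Fenklippt" repeat many times without being individual identifiers, which is why they have to be excluded before looking for duplicates.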
# fix and combine Swedish and Finnish data
swefin.sallaa <- swe.sallaa %>%
  # remove the mark-recaps
  anti_join(dup.markrec) %>%
  # remove the lakes in the data
  filter(!sai_location %in% c("Vättern", "Vänern")) %>%
  drop_na(length_mm) %>%
  # correct an obvious outlier (> 2000 mm) by dividing by 10
  mutate(length_mm = ifelse(length_mm > 2000, length_mm / 10, length_mm)) %>%
  dplyr::select(sai_cou_code, fi_year, sai_location, origin, length_mm, weight_g, AU,
                sea_age_year, juvenile_age_year, sex, date) %>%
  bind_rows(fin.sallaa %>%
              # remove individuals without length
              drop_na(length_mm) %>%
              dplyr::select("sai_cou_code", "fi_year", "sai_location", "origin", "length_mm", "weight_g",
                            "sea_age_year", "juvenile_age_year", "sex")) %>%
  bind_rows(fin.sallaa2 %>%
              # remove individuals without length
              drop_na(length_mm) %>%
              dplyr::select(sai_cou_code, fi_year, sai_location, origin, length_mm, weight_g,
                            sea_age_year, juvenile_age_year, sex, date)) %>%
  left_join(SweFin.rivers, by = "sai_location") %>%
  # prefer the existing AU before the new one
  mutate(asses.unit = if_else(is.na(AU), asses.unit, AU)) %>%
  dplyr::select(!AU)
Joining with `by = join_by(InsamlingID, Serie, InsamlMetod, AnstrTyp,
IndividID, sai_location, RT90_X_Vatten, RT90_Y_Vatten, S99TM_N_Vatten,
S99TM_E_Vatten, WGS84_N_Vatten, WGS84_E_Vatten, Plats, RT90_X_Plats,
RT90_Y_Plats, S99TM_N_Plats, S99TM_E_Plats, WGS84_N_Plats, WGS84_E_Plats,
Subdiv, AU, Syfte, fi_year, FångstDatum, Art, Åldersprov, IndividNr, length_mm,
weight_g, Behandling, Kön, stage, Genprov, Märkning, MärkeNr, Märkning2,
Märke2Nr, origin, juvenile_age_year, sea_age_year, Pluszon, AntalLek,
ÅlderLek1, ÅlderLek2, ÅlderLek3, Tydlighet, AnmÅlder, sai_cou_code, date, sex)`
anti_join: added no columns
> rows only in x 105,846
> rows only in dup.markrec ( 0)
> matched rows ( 3)
> =========
> rows total 105,846
filter: removed 36,152 rows (34%), 69,694 rows remaining
drop_na: removed 3,682 rows (5%), 66,012 rows remaining
mutate: changed one value (<1%) of 'length_mm' (0 new NAs)
drop_na: removed 254 rows (18%), 1,125 rows remaining
drop_na: removed 771 rows (5%), 13,908 rows remaining
left_join: added 6 columns (fisa_y_4326, fisa_x_4326, asses.unit, stock.origin,
stock.unit, …)
> rows only in x 0
> rows only in SweFin.rivers ( 4)
> matched rows 81,051 (includes duplicates)
> ========
> rows total 81,051
mutate: changed 1,215 values (1%) of 'asses.unit' (1,140 fewer NAs)
mutate: changed 27,430 values (>99%) of 'length_mm' (0 new NAs)
mutate: new variable 'asses.unit' (double) with one unique value and 0% NA
new variable 'region' (character) with one unique value and 0% NA
# fix and combine fra.sallaa and fra.sallaa2
fra.sallaa.b <- fra.sallaa %>%
  rename(gen.sex = `Genetic sex`,
         field.sex = `Sex observed in the field`) %>%
  mutate(
    # remove French accents and hyphens
    sai_location = stringi::stri_trans_general(sai_location, "Latin-ASCII"),
    sai_location = str_replace_all(sai_location, "-", " "),
    # changing tributaries to main sai_location
    sai_location = case_when(sai_location %in% c("Varenne", "Bethune") ~ "Arques",
                             sai_location %in% c("Inam") ~ "Elle",
                             sai_location %in% c("Austreberthe") ~ "Seine",
                             sai_location %in% c("Arroux", "Allier") ~ "Loire",
                             sai_location %in% c("Jet", "Steir") ~ "Odet",
                             .default = sai_location),
    # use the sex observed in the field when genetic sex is missing, to complete the info
    sex = str_to_lower(if_else(is.na(gen.sex), field.sex, gen.sex)),
    # correct a "1" valued entry to NA
    sex = if_else(sex %in% c("f", "m"), sex, NA),
    # calculate total from fork length and correct a TL outlier: 51 mm and 2 years old
    length_mm = if_else(is.na(total_length) | total_length == 51,
                        exp(0.2892351) * fork_length^0.9623479,
                        total_length),
    # correct one individual at 7900 mm and assume it is 790 mm
    length_mm = ifelse(length_mm > 2000, length_mm / 10, length_mm)
  ) %>%
  drop_na(length_mm) %>%
  dplyr::select(sai_cou_code, fi_year, sai_location, origin, length_mm, sea_age_year, juvenile_age_year, sex, date) %>%
  mutate(weight_g = NA) %>%
  bind_rows(fra.sallaa2 %>%
              mutate(origin = "wild",
                     sea_age_year = NA,
                     length_mm = exp(0.2892351) * fork_length^0.9623479,
                     juvenile_age_year = as.numeric(if_else(juvenile_age_year == "NA", NA, juvenile_age_year))) %>%
              dplyr::select(sai_cou_code, fi_year, sai_location, origin, length_mm, sea_age_year,
                            juvenile_age_year, sex, sai_lfs_code, date) %>%
              mutate(weight_g = NA)) %>%
  left_join(fra.rivers2)
rename: renamed 2 variables (field.sex, gen.sex)
mutate: changed 14,902 values (21%) of 'sai_location' (0 new NAs)
new variable 'sex' (character) with 3 unique values and 42% NA
new variable 'length_mm' (double) with 741 unique values and 0% NA
drop_na: no rows removed
mutate: new variable 'weight_g' (logical) with one unique value and 100% NA
mutate: converted 'juvenile_age_year' from character to double (57235 new NA)
new variable 'origin' (character) with one unique value and 0% NA
new variable 'sea_age_year' (logical) with one unique value and 100% NA
new variable 'length_mm' (double) with 207 unique values and 0% NA
mutate: new variable 'weight_g' (logical) with one unique value and 100% NA
Joining with `by = join_by(sai_location)`
left_join: added 5 columns (river, stock.unit, region.gen, fisa_x_4326, fisa_y_4326)
> rows only in x 0
> rows only in fra.rivers2 ( 12)
> matched rows 196,054 (includes duplicates)
> =========
> rows total 196,054
mutate: converted 'fisa_x_4326' from character to double (0 new NA)
changed 2,965 values (100%) of 'length_mm' (0 new NAs)
mutate: new variable 'stock.unit' (character) with one unique value and 0% NA
new variable 'region' (character) with one unique value and 0% NA
mutate: changed 61,411 values (20%) of 'sea_age_year' (0 new NAs)
changed 122,936 values (40%) of 'juvenile_age_year' (53,832 new NAs)
distinct: removed 307,550 rows (>99%), 8 rows remaining
# A tibble: 8 × 3
sea_age_year juvenile_age_year age.type
<dbl> <dbl> <chr>
1 1 1 both
2 1 NA sea.only
3 NA NA <NA>
4 NA 1 juve.only
5 0 0 both
6 0 NA sea.only
7 0 1 both
8 NA NA juve.only
# some fish are too large to be smolts - these need to be corrected
all.sallaa3 %>%
  filter(age.type == "juve.only" & length_mm >= 290) %>%
  distinct(sai_cou_code, sai_location)
# A tibble: 18 × 2
sai_cou_code sai_location
<chr> <chr>
1 SWE Ätran
2 SWE Örekilsälven
3 SWE Göta älv
4 SWE Umeälven
5 SWE Torneälven
6 SWE Piteälven
7 SWE Östersjön (hela) ICES SD 22-32
8 SWE Västerhavet (hela) ICES SD 20-21
9 SWE Mörrumsån
10 FIN Tornionjoki
11 LV Amata
12 LV Gauja
13 LV Tebra
14 LV Pēterupe
15 LV Vitrupe
16 LV Irbe
17 LV Užava
18 LV Venta
# Fix ages that are not correctly classified
all.sallaa4 <- all.sallaa3 %>%
  mutate(
    # set sea age = juvenile age
    sea_age_year = case_when(age.type == "juve.only" & length_mm >= 290 ~ juvenile_age_year,
                             .default = sea_age_year),
    # set juvenile age = NA
    juvenile_age_year = case_when(age.type == "juve.only" & length_mm >= 290 & !is.na(juvenile_age_year) ~ NA,
                                  .default = juvenile_age_year),
    # reclassify their age types
    age.type = case_when(age.type == "juve.only" & length_mm >= 290 ~ "sea.only",
                         .default = age.type)) %>%
  # one "both" that should be sea.only
  mutate(age.type = if_else(age.type == "both" & length_mm >= 500 & tot_age_year == 0, "sea.only", age.type))
mutate: changed 3,026 values (1%) of 'sea_age_year' (3,026 fewer NAs)
changed 3,026 values (1%) of 'juvenile_age_year' (3,026 new NAs)
changed 3,026 values (1%) of 'age.type' (0 new NAs)
mutate: changed one value (<1%) of 'age.type' (0 new NAs)
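The reclassification rule applied above (fish with only a juvenile age but a length of 290 mm or more are treated as having a sea age instead) can be illustrated on a toy table in base R. The three records are made up; only the 290 mm threshold and the column names come from the code above.

```r
fish <- data.frame(
  length_mm         = c(140, 310, 820),
  sea_age_year      = c(NA, NA, 2),
  juvenile_age_year = c(2, 1, NA),
  age.type          = c("juve.only", "juve.only", "sea.only"),
  stringsAsFactors  = FALSE
)

# Fish recorded with a juvenile age only but too long to be smolts
too_big <- fish$age.type == "juve.only" & fish$length_mm >= 290

# Move the recorded age to sea age, blank the juvenile age, relabel the type.
# The order matters: sea age must be filled before the juvenile age is blanked.
fish$sea_age_year[too_big]      <- fish$juvenile_age_year[too_big]
fish$juvenile_age_year[too_big] <- NA
fish$age.type[too_big]          <- "sea.only"
print(fish)
```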
# Check.
# Identify sites with smolts
swj <- all.sallaa4 %>%
  filter(age.type == "juve.only") %>%
  distinct(sai_location) %>%
  pull(sai_location)
3d. Calculate continuous age based on day of year integer
To account for growth during the catch year, the day of year of capture is converted to a fraction of a year.
# what day of year are the fish that we have age for caught?
all.sallaa4 %>%
  filter(!is.na(tot_age_year)) %>%
  mutate(yday = yday(date)) %>%
  count(yday) %>% # counts per day
  ggplot(aes(x = yday, y = n, group = 1)) +
  geom_line(color = "steelblue") +
  geom_vline(xintercept = 365/2, color = "red")
filter: removed 91,340 rows (30%), 216,218 rows remaining
mutate: new variable 'yday' (double) with 366 unique values and 1% NA
count: now 366 rows and 2 columns, ungrouped
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_line()`).
# what day of year are the fish that we DON'T have age for caught?
all.sallaa4 %>%
  filter(is.na(tot_age_year)) %>%
  mutate(yday = yday(date)) %>%
  count(yday) %>% # counts per day
  ggplot(aes(x = yday, y = n, group = 1)) +
  geom_line(color = "steelblue") +
  geom_vline(xintercept = 365/2, color = "red")
filter: removed 216,218 rows (70%), 91,340 rows remaining
mutate: new variable 'yday' (double) with 366 unique values and 4% NA
count: now 366 rows and 2 columns, ungrouped
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_line()`).
# add a decimal to year to be able to account for additional growth in the last year
all.sallaa5 <- all.sallaa4 %>%
  mutate(tot_age_decimyear = yday(date) / 365)
mutate: new variable 'tot_age_decimyear' (double) with 367 unique values and 2%
NA
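A small base R sketch of the day-of-year fraction on made-up dates. Adding the fraction to the integer age, as in the last step, is an assumption made here for illustration; the code above stores only the fraction in tot_age_decimyear.

```r
catch <- data.frame(
  date         = as.Date(c("2015-01-01", "2015-07-02", "2016-12-31")),
  tot_age_year = c(3, 3, 4)
)

# Day of year (1-366) from the date, without extra packages
catch$yday <- as.integer(format(catch$date, "%j"))

# Fraction of the year elapsed at capture, computed as in the code above
catch$frac_year <- catch$yday / 365

# Assumption for illustration: continuous age = integer age + fraction
catch$age_continuous <- catch$tot_age_year + catch$frac_year
print(catch)
```

Dividing by 365 gives a fraction slightly above 1 for 31 December in a leap year (day 366), which is worth keeping in mind if the fraction is meant to stay within a single year.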
3e. Correct weights
There are no weight measurements in the French length at age data.
# Swedish weights are in milligrams*10. So divide by 100 to get grams.
all.sallaa5 %>%
  drop_na(tot_age_year) %>% # two NA-aged Swedish fish with wrong length-weight relationships removed
  ggplot() +
  geom_point(aes(length_mm, weight_g, color = sai_cou_code)) +
  facet_wrap(~sai_cou_code, scales = "free") +
  labs(subtitle = "no weight data on French and Irish fish")
mutate: changed 59,460 values (19%) of 'weight_g' (0 new NAs)
# lightweight ones in Latvia, CHECK these with Janis at BIOR
all.sallaa6 %>%
  drop_na(tot_age_year) %>%
  filter(sai_cou_code == "LV", weight_g < 100, length_mm > 200)
# A tibble: 1,101 × 23
sai_cou_code fi_year sai_location origin length_mm weight_g sea_age_year
<chr> <dbl> <chr> <chr> <dbl> <dbl> <dbl>
1 LV 1999 Amata wild 660. 3.11 0
2 LV 1999 Amata wild 751. 5.09 0
3 LV 1999 Amata wild 670. 3.85 0
4 LV 1999 Amata wild 821. 6.15 0
5 LV 1999 Amata wild 771. 4.87 0
6 LV 1999 Amata wild 680. 3.78 0
7 LV 2006 Amata wild 821. 7.4 0
8 LV 2006 Amata wild 781. 6 0
9 LV 2006 Amata wild 620. 3.2 0
10 LV 2006 Amata wild 680. 3.6 0
# ℹ 1,091 more rows
# ℹ 16 more variables: juvenile_age_year <dbl>, sex <chr>, date <dttm>,
# fisa_y_4326 <dbl>, fisa_x_4326 <dbl>, asses.unit <dbl>, stock.origin <chr>,
# stock.unit <chr>, region <chr>, sai_lfs_code <chr>, river <chr>,
# region.gen <chr>, spat.unit <chr>, age.type <chr>, tot_age_year <dbl>,
# tot_age_decimyear <dbl>
# still some odd length-weight relationships to fix if weight is used.
all.sallaa6 %>%
  drop_na(tot_age_year) %>%
  ggplot() +
  geom_point(aes(length_mm, weight_g, color = sai_cou_code))
Warning: Removed 159492 rows containing missing values or values outside the scale range
(`geom_point()`).
3f. Map and summary of length at age
# Length at age by a suitable spatial aggregation (spat.unit), defined as French regions, the Swedish west coast and Baltic assessment units. This results in 10 spatial units (plus an NA group, which will disappear when the French region information is complete) that should represent genetic and ecological units. Some observations in the Baltic lack an assessment unit, as the only spatial information we have is that they are from the Baltic as a whole.
all.sallaa6 %>%
  ggplot(aes(sea_age_year, length_mm, color = spat.unit)) +
  geom_point() +
  facet_wrap(~spat.unit)
Warning: Removed 182270 rows containing missing values or values outside the scale range
(`geom_point()`).
Warning in plot_theme(plot): The `tagger.panel.tag.text` theme element is not defined in the element
hierarchy.
ggsave("laa_map.png", scale =0.8)
Saving 5.6 x 4 in image
Warning in plot_theme(plot): The `tagger.panel.tag.text` theme element is not defined in the element
hierarchy.
Warning in plot_theme(plot): The `tagger.panel.tag.text` theme element is not defined in the element
hierarchy.
# individual counts by river and year
all.sallaa6 %>%
  # filter(age.type == "juve.only") %>%
  mutate(river1 = paste0(sai_cou_code, ":", sai_location)) %>%
  summarise(count = n(), .by = c(fi_year, sai_cou_code, river1)) %>%
  mutate(count.ind = as.factor(ifelse(count > 50, ">50",
                               ifelse(count > 30 & count <= 50, ">30",
                               ifelse(count > 10 & count <= 30, ">10", "1 - 10")))),
         count.ind = fct_reorder(count.ind, count)) %>%
  # filter(sai_cou_code == "FRA") %>%
  ggplot(aes(fi_year, river1, fill = count.ind, group = sai_cou_code)) +
  geom_tile(color = "gray30") +
  scale_fill_viridis_d() +
  theme_light() +
  # the theme element is axis.text (axis_text is not defined and triggers a warning)
  theme(axis.text = element_text(size = 5)) +
  # the subtitle applies when the juve.only filter above is enabled
  labs(title = "Length at age individuals by river and year",
       subtitle = "Juveniles only")
mutate: new variable 'river1' (character) with 119 unique values and 0% NA
summarise: now 1,796 rows and 4 columns, ungrouped
mutate: new variable 'count.ind' (factor) with 4 unique values and 0% NA
Warning in plot_theme(plot): The `axis_text` theme element is not defined in
the element hierarchy.
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_tile()`).
mutate: new variable 'river1' (character) with 119 unique values and 0% NA
summarise: now 1,796 rows and 4 columns, ungrouped
mutate: new variable 'count.ind' (factor) with 4 unique values and 0% NA
filter: removed 1,287 rows (72%), 509 rows remaining
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_tile()`).
mutate: new variable 'river1' (character) with 119 unique values and 0% NA
summarise: now 1,796 rows and 4 columns, ungrouped
mutate: new variable 'count.ind' (factor) with 4 unique values and 0% NA
filter: removed 509 rows (28%), 1,287 rows remaining
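The count classes used in these tile plots are built with nested ifelse() calls. As a sketch of an alternative, base R's cut() gives the same four classes in one call and returns a factor whose levels are already in increasing order; the `count` vector below is made up for illustration.

```r
count <- c(1, 10, 11, 30, 31, 50, 51, 400)

# Right-closed intervals: (0,10], (10,30], (30,50], (50,Inf]
count.ind <- cut(count,
                 breaks = c(0, 10, 30, 50, Inf),
                 labels = c("1 - 10", ">10", ">30", ">50"))
print(table(count.ind))
```

Because the factor levels follow the order of the breaks, no reordering step is needed before plotting.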
By suggested spatial aggregation:
# individual counts by a suggested spatial aggregation
all.sallaa6 %>%
  summarise(count = n(), .by = c(fi_year, spat.unit)) %>%
  mutate(Ind.count = as.factor(ifelse(count > 50, ">50",
                               ifelse(count > 30 & count <= 50, ">30",
                               ifelse(count > 10 & count <= 30, ">10", "1 - 10")))),
         Ind.count = fct_reorder(Ind.count, count)) %>%
  ggplot(aes(fi_year, spat.unit, fill = Ind.count)) +
  geom_tile(color = "gray30") +
  scale_fill_viridis_d() +
  labs(y = "") +
  theme_light()
summarise: now 485 rows and 3 columns, ungrouped
mutate: new variable 'Ind.count' (factor) with 4 unique values and 0% NA
Warning: Removed 2 rows containing missing values or values outside the scale range
(`geom_tile()`).
ggsave("laa_temp.png", scale =0.8)
Saving 5.6 x 4 in image
Warning: Removed 2 rows containing missing values or values outside the scale range
(`geom_tile()`).
# sex information by suggested spatial region
all.sallaa6 %>%
  summarise(ind.count = n(), .by = c(spat.unit, sex)) %>%
  ggplot(aes(ind.count, sex, fill = spat.unit)) +
  geom_bar(stat = "identity", position = "stack") +
  scale_fill_viridis_d()
summarise: now 39 rows and 3 columns, ungrouped
# Number of individuals by suggested spatial region
all.sallaa6 %>%
  summarise(ind.count = n(), .by = c(fi_year, spat.unit)) %>%
  ggplot(aes(ind.count, spat.unit)) +
  geom_bar(stat = "identity", position = "stack", fill = "#440154FF") +
  geom_text(aes(label = ind.count), nudge_x = 2400)
summarise: now 485 rows and 3 columns, ungrouped
4. Fecundity at length
4a. Filter, clean and combine data
Remove NA values in length and fecundity (n.eggs)
Correct an obvious outlier in the Finland 1 data
Calculate total length (\(L_t\)) from fork length (\(L_f\)) where needed in the French data, based on the model \(L_t = \exp(0.05704206) \cdot L_f^{0.99670332}\) (see below).
The kept variables are: length, sai_location, sai_cou_code, origin, fi_year, n.eggs. Spatial variables are added to the data from the tables created in 2.
# Model to convert French fork lengths in the fecundity data to total lengths
fra.salfec %>%
  drop_na(length.t, length.f) %>%
  lm(log(length.t) ~ log(length.f), data = .) %>%
  tidy() %>%
  pull(estimate)

fra.salfec %>%
  drop_na(length.t, length.f) %>%
  ggplot(aes(length.f, length.t)) +
  geom_point() +
  geom_line(aes(x = length.f, y = exp(0.05704206) * length.f^0.99670332), col = "red") +
  labs(title = "fork to total length fit fecundity")
Warning in plot_theme(plot): The `tagger.panel.tag.text` theme element is not defined in the element
hierarchy.
ggsave("fec_map.png", scale =0.8)
Saving 5.6 x 4 in image
Warning in plot_theme(plot): The `tagger.panel.tag.text` theme element is not defined in the element
hierarchy.
The `tagger.panel.tag.text` theme element is not defined in the element
hierarchy.